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Abstract 



In this paper we relate the matrix ©g of the second moments of a spherically truncated 
normal multivariate to its full covariance matrix £ and present an algorithm to invert the 
relation and reconstruct T, from ©g. While the eigenvectors of S are left invariant by the 
truncation, its eigenvalues are non-uniformly damped. We show that the eigenvalues of £ can 
be reconstructed from their truncated counterparts via a fixed point iteration, whose convergence 
we prove analytically. The procedure requires the computation of multidimensional Gaussian 
integrals over a Euclidean ball, for which we extend a numerical technique, originally proposed 
by Ruben in 1962, based on a series expansion in chi-square distributions. In order to study 
the feasibility of our approach, we examine the convergence rate of some iterative schemes on 
suitably chosen ensembles of Wishart matrices. 

1 Introduction 

It is more than forty years since Tallis pQ worked out the moment-generating function of a normal 
multivariate X = {Xk}'" k=l ~ J\f v (0,T,), subject to the conditional event 



The perfect match between the symmetries of the ellipsoid £ v (p;T l ) and those of A/"„(0, S) allows 
for an exact analytic result, from which the complete set of multivariate truncated moments can be 
extracted upon differentiation. Consider for instance the matrix &s(p; X) of the second truncated 
moments, expressing the covariances among the components of X within £ v (p;T,). From Tallis' 
paper it turns out that 



Xe£ v {p; E) 



£ v (p; S) = {x G K" : x T £ _1 x < p} . 




6 £ (p;Z) = c T (p)Z 



Fy+2(P) 

Fv{p) 



(1.2) 
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with F v denoting the cumulative distribution function of a x 2 -variable with v degrees of freedom. 



Inverting eq. ( 1.2 ) - so as to express S as a function of &s - is trivial, since c T (p) is a scalar damping 
factor independent of E. In this paper, we shall refer to such inverse relation as the reconstruction 
of E from &£. Unfortunately, life is not always so easy. In general, the effects produced on the 
expectation of functions of X by cutting off the probability density outside a generic domain Pel" 
can be hardly calculated in closed form, especially if the boundary of T> is shaped in a way that 



breaks the ellipsoidal symmetry of A/^(0, E). Thus, for instance, unlike eq. (1.2) the matrix of the 
second truncated moments is expected in general to display a non-linear /non-trivial dependence 
upon E. 

In the present paper, we consider the case where I? is a ^-dimensional Euclidean ball with 
center in the origin and square radius p. Specifically, we discuss the reconstruction of S from the 
matrix of the spherically truncated second moments. To this aim, we need to mimic Tallis' 



calculation, with eq. (1.1) replaced by the conditional event 



X € B v (p) , B v {p) = {x G R v : x T x < p} . (1.3) 

This is precisely an example of the situation described in the previous paragraph: although B v {p) 
has a higher degree of symmetry than £ v (p; E), still there is no possibility of obtaining a closed-form 
relation between E and since B v (p) breaks the ellipsoidal symmetry of J\f v (0, E): technically 
speaking, in this case we cannot perform any change of variable under the defining integral of the 
moment-generating function, which may reduce the dimensionality of the problem, as in Tallis' 
paper. 

In spite of that, the residual symmetries characterizing the truncated distribution help simplify 
the problem in the following respects: i) the reflection invariance of the whole set-up still yields 
E[Xfc|X G B v (p)] = Vfc, and ii) the rotational invariance of B v (p) preserves the possibility 
of defining the principal components of the distribution just like in the unconstrained case. In 
particular, the latter property means that &g and E share the same orthonormal eigenvectors. In 
fact, the reconstruction of E from Gg amounts to solving a system of non-linear integral equations, 
having the eigenvalues A = {Afc}^! =1 of E as unknown variables and the eigenvalues p = {pk}k=i 
of &s as input parameters. In a lack of analytic techniques to evaluate exactly the integrals 
involved, we resort to a numerical algorithm, of which we investigate feasibility, performance and 
optimization. 

Here is an outline of the paper. In sect. 2, we show that the aforementioned integral equations 
have the analytic structure of a fixed point vector equation, that is to say A = T(A). This suggests 
to achieve the reconstruction of A numerically via suitably chosen iterative schemes. In sect. 3, we 
prove the convergence of the simplest of them by inductive arguments, the validity of which relies 
upon the monotonicity properties of ratios of Gaussian integrals over B v (p). In sect. 4, we review 
some numerical techniques for the computation of Gaussian integrals over B v (p) with controlled 
systematic error. These are based on and extend a classic work by Ruben [2] on the distribution 
of quadratic forms of normal variables. For the sake of readability, we defer proofs of statements 
made in this sect, to Appendix A. In sect. 5, we report on our numerical experiences. Since the 
simplest iterative scheme, namely the Gauss- Jacobi iteration, is too slow for practical purposes, 
we investigate the performance of its improved version based on over-relaxation. As expected, we 
find that the latter has a higher convergence rate, yet it slows down polynomially in 1/ p as p — > 
and exponentially in v as v — > oo. In order to reduce the slowing down, we propose an acceleration 
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technique, which boosts the higher components of the eigenvalue spectrum. A series of Monte Carlo 
simulations enables us to quantify the speedup. To conclude, we summarize our findings in sect. 6. 



2 Definitions and set— up 

Let X € W be a random vector with jointly normal distribution N" v (0, S) in v > 1 dimensions. 
The probability that X falls within B v {p) is measured by the Gaussian integral 

= F[XeB v (p)] = ^ ^ d ^ expj-Vrtj . (2.1) 

Since £ is symmetric positive definite, it has orthonormal eigenvectors = AjfW . Let us denote 
by R ee {v^}ij =1 the orthogonal matrix having these vectors as columns and by A = diag(A) = 
R T T,R the diagonal counterpart of S. From the invariance of B v (p) under rotations, it follows that 
a depends upon £ just by way of A. Accordingly, we rename the Gaussian probability content of 
B v (p) as 

a(p;X) = [ d v x TT 5(x m ,X m ) , S(y,rj) = — =^exp{-^-) . (2.2) 

Jb v (p) ^ii V2^? I 2?? J 



Note that eq. (2.2) is not sufficient to fully characterize the random vector X under the spherical 
constraint, for which we need to calculate the distribution law ¥[X E A\X £ B v {p)\ as a function 
of A C R". Alternatively, we can describe X in terms of the complete set of its truncated moments 

m fcl ...^(p;S) ee E[Xp ...Xfr\Xe B v (p)) , {fci}i=i,...,« G N" , (2.3) 
As usual, these can be all obtained from the moment-generating function 

a-m(t)=- — - * -j= [ d v x explt T x- 1 x T ^- 1 x\ , t £ W , (2.4) 
(2vr)-/ 2 |S| 1 /2 J Bv{p) I 2 J ' 

by differentiating the latter an arbitrary number of times with respect to the components of t, viz. 

Qk 1 + ...+k v m ^ 



(dti) fc i ...(dt v ) k - 



(2.5) 

t=o 

It will be observed that m{t) is in general not invariant under rotations of t. Therefore, unlike 
a, the moments m kl ,,, kv depend effectively on both A and R. For instance, for the matrix of the 
second moments &q ee {d 2 m/dtidtj\t=o}i j = \ such dependence amounts to 

Y[S(x m ,X m ). (2.6) 

k,e=i Jb ^p) m=l 

By parity, the only non-vanishing terms in the above sum are those with k = £. Hence, it follows 
that £ and share R as a common diagonalizing matrix. In other words, if M = diag(/i) is the 
diagonal matrix of the eigenvalues of ©g, then M = R t GqR. Moreover, p k is related to A^ by 

f x 2 " 

/"fc = A fc — , a k (p;X) = / d^x IT 5(x k , X k ) , k = l,...,v. (2.7) 
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The ratios oik /a are naturally interpreted as adimensional correction factors to the eigenvalues 



of S, so they play the same role as c T (p) in eq. (1.2). However, ct^/a depends explicitly on the 
subscript k, thus each eigenvalue is damped differently from the others as a consequence of the 
condition X G B v {p). 



Remark 2.1. In practical terms, eqs. (2.6)-(2.1) tell us that estimating the sample covariance 



matrix of X ~ AA(0, X) from a spherically truncated population {x^}^ =1 , made of N indepen- 
dent units, via the classical estimator Qij = (N — l)" 1 X^nLiC^i"^ — %^)( x ^j — x j)> being 
x = iV _1 Y2n=i x ^ the sample mean, yields a biased result. Nonetheless, the bias affects only 
the eigenvalues of the estimator, whereas its eigenvectors are unbiased. 



Eqs. (2.2) and (2.7) suggest to introduce a general notation for the Gaussian integrals over 



B v (p), under the assumption £ = A. So, we define 



2 2 9 

/y>4-> (Yi^ fvvi 
•1*1, dj Q dj 



a Mm ...(p; A) = / d v x^^-^...H 5(x m , X m ) , (2.1 



b v { p ) A fe \i A 



m , 
m=l 



with each subscript q on the Z./i.s. addressing an additional factor of x 2 /X q under the integral sign 
on the r.h.s. (no subscripts means a). We shall discuss several analytic properties of such integrals 
in ref. [3]. Here, we lay emphasis on some issues concerning the monotonicity trends of the ratios 
ak/a. Specifically, 



Proposition 2.1 (monotonicities). Let X^ = {Aj}*!^ v denote the set of the full eigenval 
without Afe. The ratios ct k ja fulfill the following properties: 

Pi) Afc — (p; A) is a monotonic increasing function of Xk at fixed p and X( k \; 

P2) —(p;X) is a monotonic decreasing function of Xk at fixed p and Am; 

p%) — (p; A) is a monotonic decreasing function of Xi (i 7^ k) at fixed p and X(i), 

where an innocuous abuse of notation has been made on writing ^(p; A) in place ofak(p', X)/a(p; A). 

Proof. Let the symbol dk = d/dXk denote a derivative with respect to A/%. In order to prove 
property p\), we apply the chain rule of differentiation to XkCtk/ct and then we pass dk under the 
integral sign in c^a and dkOtk- In this way, we obtain 

dk (\k^) = l(°f-§)=^2 I X G B v {p)\ - E[Xl I X G B v {p)f) 



ues 



2X ^ & x{Xl\XeB v {p))>Q. (2.9) 



1 

<k 



Moreover, since the truncated marginal density of X 2 is positive within a set of non-zero measure 
in M., the monotonic trend of XkOLkjct i n A& is strict. □ Properties P2) and ^3) are less trivial than 
Pi). Indeed, the same reasoning as above now yields on the one hand 

Xkd k (-) = d k (X k ^) - - = ^ {var (X 2 k \ X G B v (p)) - 2X k E[X 2 k \ X G B v (p)}} , (2.10) 

\ OL / \ OL / OL k 
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and on the other 

Ai8i (?H - 1?) = vk™ {xl xl 1 x e BM) { -* k) - (2 - n) 



Though not a priori evident, the r.h.s. of both eqs. ( 2.10[ )— ( 2.11 ) is negative (and vanishes in the 



limit p — > oo). A systematic discussion of the inequalities var(A^r) < 2AfcE[A^;] and cov(Xj , X^) < 
within a Euclidean ball is out of the scope of the present paper. Accordingly, we take them here as 
a conjecture and refer the reader to ref. [3j for a detailed analysis. However, their meaning should 
be intuitively clear. The variance inequality quantifies the squeezing affecting Xi as a consequence 
of the truncation (in the unconstrained case it would be var(X|) = 2A?). The covariance inequality 
follows from the opposition arising among the square components in proximity of the boundary of 
B v {p). Indeed, if Xj /* p, then X| \ M k ^ j in order for X to stay within B v (p). We warn 
the reader that a quantitative analysis of the square correlation inequalities is not trivial (see e.g. 
refs. [HE] for a discussion in the case where the probability distribution of X is flat instead of being 
normal) . □ 



A consequence of Proposition 2.1 is represented by the following 



Corollary 2.1. Given v, p and X, pk is bounded by 

9 </*<£, r(,,,) = (2, + l)^±^4, (2.12) 



r 1 V > 2A fe 



_p\- ™ ~ 3' v '' v ' M(v,v + 3/2,z) 

2X k J 

with M denoting a Kummer function, viz. 



„»!(&)» 1 ' r(i) 

71=0 



Proof. The upper bound of eq. (2.12) corresponds to the value of pt in the v-tuple limit Xk — > oo, 
X(k) — >• {0 . . . , 0}. This is indeed the maximum possible value allowed for pk according to properties 
pi) and ps) of Proposition |2.1| In order to perform this limit, we observe that 



lim <%,r?) =<%), (2.14) 

with the 5 symbol on the r.h.s. representing the Dirac delta function (the reader who is not familiar 
with the theory of distributions may refer for instance to ref. [B] for an introduction). Accordingly, 

[+Vp I [+Vp p 

lim lim Mft = / ^ x k x k / ^- x k = ~ • (2-15) 

A fc ^oo A {fc) ^{0,...,0} ' / J^^p 3 

The lower bound corresponds instead to the value taken by pk as X^) — > {oo . . . , oo} and A^ is 
kept fixed. In this limit, all the Gaussian factors in the probability density function except the k th 
one flatten to one. Hence, 



lim pk = lim 



dx k x\ 5(x k , Xk) cS'"' 1 ' (p - x k ; A (fc) ) 



A (fc) ^{oo...,oo} A (fc) -^{oo...,oo} f+VP (v-1) / 2 x \ 

/ dx k d(Xk, Xk) a K > [p- x k ;X( k )) 
J-Vp 
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+v / P 



dx k x k e 2X k (p-x k ) 



2\v— 1 



dxfc x fc e 2X k k (1 — x k 



2\v—l 



dx k e 2A fe (p-^)'' 1 



dx k e 2A fc fc (l — x 



2\v-l 



(2.16) 



Numerator and denominator of the rightmost ratio are easily recognized to be integral representa- 
tions of Kummer functions (see e.g. ref. ch. 13]). □ 



The upper bound of eq. (2.12) can be sharpened, as clarified by the following 

Proposition 2.2 (Bounds on the truncated moments). Let v, p and A be given. If {ii, ■ ■ ■ ,i v } is 
a permutation of {1, ... ,v} such that p% x < m 2 < . .. < m v , then the following upper bounds hold: 



J2^k < p; 

k=l 



P 



, k = 1, . . . , v . 



v-k + 1 

Proof. The overall upper bound on the sum of truncated moments follows from 

S(x m , X m ) < p. 

At the same time, the sum can be split and bounded from below by 

v 71 V n 

= + Yl f^u - Yl Vi* + ( v - n )Pi n+ i , n = 0, 



(2.17) 



v 1 v l . / V \ V 

= -£ A ^ = a B{p f X \Y x l) II' 

k=l k=l J Bv{p) \k=l / m=l 



(2.18) 



(2.19) 



fc=l 



fc=i 



k=n+l 



k=l 



The single upper bounds on the p k s are then obtained from eqs. (2.18)-(2.19). It will be noted 
that eq. ( 2.17| ) ii) is sharper than the upper bound of eq. (2.12) only for v > 3 and k < v — 2. □ 

From now on, we shall assume with no loss of generality, that the eigenvalues of £ are increas- 
ingly ordered, namely < Ai < • • • < A^ (we can always permute the labels of the coordinate axes, 
so as to let this be the case). An important aspect related to the eigenvalue ordering is provided 
by the following 

Proposition 2.3 (Eigenvalue ordering). Let v, p and A be given. If \\ < A2 < ... < X V) then it 
holds p\ < P2 < • • • < pv as well. 

Proof. In order to show that the spherical truncation does not violate the eigenvalue ordering, we 
make repeated use of the monotonicity properties of Proposition 2.1. Specifically, if i < j then 



Pi = 


h — (p;{Ai,. 
a 


. , Aj, . . 


• , Xj , . . 


■ ,x v }) 




< 


Aj^(p; {Ai, . 


. . , Xj, . . 


. , Xj , . 




increasing monotonicity of Aj — 

a 




— (p;{Ai,. 

a 


■ • > Xj, . 


• > Xj, . 


..,A4) 


<^> exchange symmetry jf>j 


< 


Aj— (p;{Ai,. 
a 


. . , Aj, . . 


. , A j , . 


-,A4) 


<*~> decreasing monotonicity of — 

a 



Pj 



(2.20) 
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Fig. 1 - Allowed regions for /j, in two and three dimensions. In both pictures, bounds are 



determined by eq. (2.12| (corresponding to the square region on the left and the cubic region 
on the right), while eqs. (2.17) i) (corresponding to the simplexes) and ii) are too loose to 
constrain /i. Note that at v — 3, the cubic region touches the simplex at P = (p/3, p/3, p/S). 
At v > 4, eqs. (2.17) i) and ii) become effective. 



where the symbol "<~~»" is used to explain where the inequality sign preceding it comes from, and 
the "exchange symmetry" refers to the formal property of the one-index Gaussian integrals over 
B v (p) to fulfill ai(p; {Ai, . . . , Aj, . . . , X j} . . . , X v }) = aj(p; {Ai, . . . , Xj, . . . , Aj, . . . , X v }). □ 



Let us now focus on eqs. (2.7). They have to be solved in order to reconstruct A from p. 
Formally, if we introduce a family of truncation operators r p : — > (parametrically depending 
on p), such that 

(T p -X) k = X k — (p;X), k = l,...,v, (2.21) 
a 

then the reconstruction of A from p amounts to calculating A = r^ 1 • p. One should be aware that 
T p is not a surjective operator in view of Corollary 2.1 and Proposition 2.2. Therefore, r" 1 is only 
defined within a bounded domain T>{t~ 1 ) = Im(r p ) C H v (p), being 

H v (p) = ^xeR v + : x k < min {^ „_fc + 1 } and ib x k^Pi Vfcj. (2.22) 

Since we cannot exclude the existence of vectors p G W such that p G H v (p) and p P(r~ 1 ), we 
must conclude that T>{r~ l ) could be a proper subset of H v (p). The lack of a full characterization 
of T>(jp l ) does not spoil our work, since we shall always assume in the sequel that p comes from 
the application of r p to some A, thus p G T>{t~ 1 ) by construction. Of course, we can turn the 
above discussion into a necessary condition for the feasibility of the reconstruction of E, as in the 
following (useful for the applications) 

Remark 2.2 (No-go condition). Let p > 0, p G and R be given, such that R T R = RR T = I v 
and det R = 1. If p ^ H v (p), then there exists no X which may be interpreted as the full covariance 
matrix corresponding to && = RMR T via spherical truncation outside B v (p), being M = diag(/i). 
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Now, we observe that eqs. (2.7) can be written in the equivalent form 
A = T(X;p;p) , 

T : RI X RI X R+ ^ El : '//,( A: //; />) - ///, — (/>: A) . /,--!,... 



T k (X;p;p) = Mfe — (p; A) , 



(2.23) 
(2.24) 



Since p and p are (non-independent) input parameters for the covariance reconstruction problem 
(and in order to keep the notation light), in the sequel we shall leave the dependence of T upon p 
and p implicitly understood, i.e., we shall write eq. (2.23) as A = T(X). Hence, we see that the full 



eigenvalue spectrum A is a fixed point of the operator T. This suggests to obtain it as the limit of 
a sequence 



A 



lim A (n) 



n 



0,1,. 



(2.25) 
(2.26) 



provided this can be shown to converge. Note that since a k < a, it follows that T k (\^) > p k Vra, 
so the sequence is bounded from below by p. In particular, this holds for n = 0. Therefore, the 
sequence moves to the right direction at least at the beginning. A formal proof of convergence, 
based on the monotonicity properties stated by Proposition 2.1, is given in next section. 



3 Convergence of the fixed point equation 

We split our argument into three propositions, describing different properties of the sequence \( n '. 
They assert respectively that i) the sequence is component-wise monotonic increasing; ii) the 
sequence is component-wise bounded from above by any fixed point of T; Hi) if T has a fixed point, 
this must be unique. Statements i) and ii) are sufficient to guarantee the convergence of the sequence 
to a finite limit (the unconstrained spectrum is a fixed point of T). In addition, the limit is easily 
recognized to be a fixed point of T. Hence, statement Hi) guarantees that the sequence converges 
to the unconstrained eigenvalue spectrum. We remark that all the monotonicities discussed in 
Proposition 2.1 are strict, i.e. the ratios a k /a have no stationary points at finite p and A, which is 
crucial for the proof. 

Proposition 3.1 (Increasing monotonicity). Given v, p and p G V(t~ 1 ), the sequence A<°) = p, 
^(n+i) _ T{X^), n = 0, 1, . . . is monotonic increasing, viz. A^ n+1 ' 1 > A^ Vfc = 1, . . . ,v. 

Proof. The proof is by induction. We first notice that 

= T k (\^) = T k (p) = p k — {p-p)>p k = \f, k = l,...,v, (3.1) 

the inequality following from a k (p; p) < a(p;p). Suppose now that the property of increasing 
monotonicity has been checked off up to the n th element of the sequence. Then, 

Al n+1) = P k -( P ; A (n) ) > p k -( P ; \ {n - l) ) = \ { k n) , (3.2) 

the inequality following this time from the inductive hypothesis and from properties p%) and p^) of 
Proposition 2.1. □ 
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Proposition 3.2 (Boundedness) . Given v, p and p G T>(r p 1 ), the sequence A^ ' = p, A( n+1 ) = 

T(X^), n = 0, 1, . . . is bounded from above, viz. a[ < At V k = 1, . . . , v, being A* a fixed point of 
T. 

Proof. We proceed again by induction. We first notice that 

k = 1, . . . , v , (3-3) 



A L 0) = Mfc < m*— (p; A *) = A fc > 



the inequality following as previously from a k (p; A*) < a(p; A*). Suppose now that the property of 
boundedness has been checked off up to the n th element of the sequence. Then, 

4 n+1) = l*k-0n AW) = Aj^fa A*)^(p; A<»>) < A* fc ^(p; AW)^(p; A<»>) = A£ , (3.4) 

the inequality following for the last time from the inductive hypothesis and from properties p^) and 
P3) of Proposition 2.1. □ 

According to Propositions 3.1 and 3.2, the sequence converges. Now, let A = limn-^oo A^ n ^ be 
the limit of the sequence. Effortlessly, we prove that A is a fixed point of T. Indeed, 

(n) ,. OS , , ,„ji, a 

1 = lim pk 

ctk" ' ' a k 

Note that passing the limit over n under the integral sign is certainly allowed for Gaussian integrals. 



Lira X£> = hm p k —(p; A^ -1 ^) = p k — (p; lim A^ 1 )) = T k (X) . 



(3.5) 



Proposition 3.3 (Uniqueness of the fixed point). Let X' = T(X') and X" = T(A") be two fixed 
points ofT, corresponding to the same choice of v, p and p £ X^t^ 1 ). Then, it must be X' = X" . 



Proof. According to the hypothesis, A' and A" fulfill the equations 



Afc = Hk — {p; A') 
a k 



a 



Xl = p k -(p;X") 

Ok 



Pfc = X' k — (p; X') 
a 



p k = K°^(p;X"). 
a 



Hence, 



= p k -p k = X' k — (p; A') - A£°%; A") = V [ C dt J u (>; A" + 1 (A' - A")) 
where J denotes the Jacobian matrix of t p and is given by 

J«(p;A) (a^(„ ; a)) _ |£ (3£ - 2«) _ [A-'ntaA)A] u , 

having set = (1/2) (a^ /a — a k ae/a 2 ). It will be noted that Q = {^ke} k e=i * s essentially the 
covariance matrix of the square components of X under spherical truncation (we have come across 



(3.6) 
(3.7) 

(Aj-A?), (3.8) 
(3.9) 



its matrix elements in eqs. (2.9)-(2.11 )). As such, $7 is symmetric positive definite. Indeed, 
1 

2AfcA^ 
1 



n 



ki 



-cov {X 2 kl Xj \X eB v (p)) 



2X k Xc 



-E 



X\ - E [X k I X e B v (p)] ){xj- E[X? I X e B v (p)] 



X e B„(p) 



. (3.10) 
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On setting Z k = (X| - E[X% \ X G B v (p)])/y/2X k , we can represent fl as O = E[ZZ T \ X G B v (p)]. 
IfiGr is not the null vector, then x T nx = E[x T ZZ T x \ X G B v (p)} = E[(x T Z) 2 \ X G B v {p)\ > 0. 
Moreover, the eigenvalues of f2 fulfill the secular equation 

= det(n - <j>I v ) = det[A _1 (fi - 0I„)A] = det(A _1 fiA - <j)l v ) = det(J - (j)l v ) , (3.11) 

whence it follows that J is positive definite as well (though it is not symmetric). Since the sum of 
positive definite matrices is positive definite, we conclude that L d t J (p; X" + t(X' — A")) is positive 



definite too. As such, it is non-singular. Therefore, from eq. (3.8) we conclude that A' = A". □ 



4 Numerical computation of Gaussian integrals over B v (p) 

Let us now see how to compute ay m ._ with controlled precision. Essentially, all the relevant work 
has been originally done by Ruben in ref. [2], where the case of a is thoroughly discussed. We 
just need to extend Ruben's technique to Gaussian integrals containing powers of the integration 
variable. Specifically, it is shown in ref. [2] that a(p; A) can be represented as a series of chi-square 
cumulative distribution functions, 



n 



(p; A) = c m (s; \)F v+2m {p/s) . 



(4.1) 



m=0 



The scale factor s has the same physical dimension as p and A. It is introduced in order to factorize 
the dependence of a upon p and A at each order of the expansion. The series on the r.h.s. of 



eq. (4.1) converges uniformly on every finite interval of p. The coefficients c m are given by 
1 s v / 2+m T(v/2 + m) 



c m (s; A) 



-QT 



m = 0, 1, . . . 



(4.2) 



ml |A|V2 r>/2) 

having defined Q(x) = x T [A _1 — s -1 !^]^ for and M as the uniform average operator on the 

(v — l)-sphere dB v (l) = {u G W : u T u = 1}, viz. 

r>/2) 



^ f du <j,{u) , V cf> G C° (8B V (1)) a.e. 



(4.3) 



Unfortunately, eq. (4.2) is not particularly convenient for numerical computations, since M[(— Q) 7 



is only given in integral form. However, it is also shown in ref. |2j that the coefficients c m can be 
extracted from the Taylor expansion (at zq = 0) of the generating function 



IK*) = n 



fc=i 



1/2 



1 



1 



Afc 



-1/2 



i.e. ip(z) = c m (s; A); 



(4.4) 



m=0 



This series converges uniformly for \z\ < minj |1 — s/Aj| 1 . On evaluating the derivatives of ip(z), 
it is then shown that the c m 's fulfill the recursion 



co 



9n 




n 

m=l 

v 



- n— 1 



(Jn—rCr 



n 



1,2, 



r=0 



(4.5) 



m=l 



S 
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Finally, the systematic error produced on considering only the lowest k terms of the chi-square 
series of eq. (4.1) is estimated by 



TZ n (p;X) 



^ Cm(s; X)F v+ 2m{p/s) 



< co(s; A) r( ;f + n) £(1 - r?)-^ 2 +")^ +2n [(l - V )p/s] =. <R n , 
1 (v/2) n\ 



(4.6) 



with r] = maxj |1 — s/Aj|. 

Now, as mentioned, it is possible to extend the above expansion to all Gaussian integrals otktm...- 
Here, we are interested only in a k and atjk, since these are needed in order to implement the fixed 
point iteration and to compute the Jacobian matrix of r p . The extension is provided by the following 

Theorem 4.1 (Ruben's expansions). The integrals a k and otjk admit the series representations 

(4.7) 



«fe(p;A) = ^2 c k;m (s;X)F v+2 ( m+1 )(p/s) , 

m=0 

oo 

= ^2 Cj k - m (s; X)F v+2 (m+2)(p/s) , 



(4.8) 



m=0 



with s an arbitrary positive constant. The series coefficients are given resp. by 
s v + 2m s v / 2+m T{v/2 + m) 



c k;m{ s ] ^) 



[(-Q) 



CjA;;m(s; A) = (1 + 2<5j fc 



A fc m! |A|V2 T{v/2) 

s s (v + 2m + 2)(v + 2m) W 2+m r(v/2 + m) 



(4.9) 



Xj Afc 



|A|V2 I>/2) 



[(-Q) 



(4.10) 



wrai/i denoting the Kronecker symbol. The series on the r.h.s. of eqs. (4-7)-('4-8) converge 
uniformly on every finite interval of p. The functions 



fpkk(z) = 3 



A J 



3/2 



1- [ 1-— \Z 

Afc 



-3/2 



n 

i^k 



1/2 



1 - 1 1 - a 1 



-1/2 



5/2 



1- 1 



-5/2 



n 

i^k 



1/2 



1-1 



-1/2 



(4.11) 

(4.12) 



Xj Xk J 



1-1 



1-1 1-— \z 

Afc 



-3/2 



1/2 



1 " * 1 " AT ' 



-1/2 



(j + k) 



(4.13) 
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are generating functions resp. for the coefficients Cfc ;m , Ckk-,m an d Cjh;m (j ^ J - e - they fulfill 

oo 

M*)= ^c fc;m (s;A)z m , (4.14) 

m=0 

oo 

ipjk(z) = Cj k;m (s; \)z m , 



(4.15) 



m=0 



for \z\ < mirij |1 — s/Aj| . Finally, the coefficients Ck- m , Ckk;m o-nd Cjk- m (j '7^ k) can be obtained 
iteratively from the recursions 



Cfc;0 
9k\m 



m—1 



/ . 9k\m—r Ck\r i 

m 

r=0 

^2 ek 'i ( 1 ~ TT ) ' m > 1 ; 



(4.16) 



and 



Cjfc;0 = (1 + -<\ik) ( . ) ( . I ''(I : ''jlr.m 



^7/ V^fc 



m—1 



2 m 9jk;m-r Cjk;r 



i=l ^ ' 



r=0 



(4.17) 



m > 1 : 



where the auxiliary coefficients e^i and ejk-i are defined by 



3 if i = k 

1 otherwise 

5 if i = k 

1 otherwise 



3 if i = j or k 
1 otherwise 



U ¥= k) 



(4.18) 



(4.19) 



□ 



It is not difficult to further generalize this theorem, so as to provide a chi-square expansion 
for any Gaussian integral c%g m .„. The proof follows closely the original one given by Ruben. We 
reproduce it in Appendix A for ctk, just to highlight the differences arising when the Gaussian 
integral contains powers of the integration variable. 



Analogously to eq. (4.6), it is possible to estimate the systematic error produced when consid- 



ering only the lowest k terms of the chi-square series of and a^. Specifically, we find 



T^kvn 



^ Ck;m(s; X)F v+2 (m+1) (p/s] 



<c k X(l-v)-^ +n+1)F{v/2 + n + 1) 



r>/2) 



F u+2(n+1 )[(l-7y)p/s] = *K fc;n , 



(4.20) 
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7tjk;n = Cjf. ;m (s) X)F v+2 ( m+ 2) (p/ s) 

m=n 

< C jk; 0^(l ~ ^)- ( " /2+n+2) l: ^^ 7 ^ ^ + 2(n + 2)[(l " V)P/S] EE m jk , n . (4.21) 

In order to evaluate all Ruben series with controlled uncertainty, we first selQs = 2XiX v /(Xt + X v ), 
then we choose a unique threshold e representing the maximum tolerable systematic error, e.g. 
£dp = 1-0 x 1CT 14 (roughly corresponding to double floating-point precision), for all a, a k and djk, 
and finally for each ax we compute the integer 

k th = mm{n : <H X; n < e} , (4.22) 

n>l 

providing the minimum number of chi-square terms, for which the upper bound 9ix;n to the resid- 
ual sum TZx-n lies below e. Of course, this procedure overshoots the minimum number of terms 
really required for the TVs to lie below e, since we actually operate on the 9t's instead of the 7£'s. 
Nevertheless, the computational overhead is acceptable, as it will be shown in next section. For 
the sake of completeness, it must be said that typically the values of kth for a, a k and ctjk with the 
same e (and p, X) are not much different from each other. 

To conclude, we notice that kth depends non-trivially upon A. By contrast, since F v {x) is 
monotonic increasing in x, we clearly see that kth i s monotonic increasing in p. Now, should one 
evaluate a and the like for a given A at several values of p, say p\ < p2 < ■ ■ ■ < p m ax, it is 
advisable to save computing resources and work out Ruben coefficients just once, up to the order 
kth corresponding to /9 max , since &th(pi) < • • • < ^th(Pmax)- We made use of this trick throughout 
our numerical experiences, as reported in the sequel. 



5 Numerical analysis of the reconstruction process 



The fixed point eq. (2.25) represents the simplest iterative scheme that can be used in order to 
reconstruct the solution A = r^ 1 • \i. In the literature of numerical methods, this scheme is known 
as a non-linear Gauss-Jacobi (GJ) iteration (see e.g. ref. |8]). Accordingly, we shall rewrite it as 
•^gTa^ = ^fc(^Gj )- ^ s we have seen, the sequence A^j converges with no exception as n — > oo, 
provided p G P(r p _1 ). Given e T > 0, the number of steps na needed for an approximate convergence 
with relative precision e T , i.e. 

( Mx (n) x (n-l)|| 1 

— ■ J \\ A GJ A GJ oo I /rr 1\ 

^ = mm n: < e T , (5.1) 

_ I, II GJ 1 1 oo J 

depends not only upon e T , but also on p and p (note that the stopping rule is well conditioned, 
since ||A^ n ^||oo > Vn and also lim n _ 5 . OC) ||A^||oo > 0). In order to characterize statistically the 
convergence rate of the reconstruction process, we must integrate out the fluctuations of nn due 
to changes of p, i.e. we must average n; t by letting p fluctuate across its own probability space. 
In this way, we obtain the quantity = [ri; t | e T , p] , which better synthesizes the cost of the 



1 see once more ref. [2] for an exhaustive discussion on how to choose 
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reconstruction for given e T and p. It should be evident that carrying out this idea analytically is 
hard, for on the one hand na depends upon p non-linearly, and on the other p has a complicate 
distribution, as we briefly explain below. 



5.1 Choice of the eigenvalue ensemble 

Since A is the eigenvalue spectrum of a full covariance matrix, it is reasonable to assume its 
distribution to be a Wishart W v (p, So) for some scale matrix So and for some number of degrees 
of freedom p > v. In the sequel, we shall make the ideal assumption So = p~ x ■ I v , so that the 
probability measure of A is (see e.g. ref. [9]) 

dw „ fe A) = ^-.v, ^ir M A^»^(-ia.,A t )m <J (A J -A t) d „ A 

y ' 2 v P/ 2 T v {p/2)Y v {v/2) v ' 

Under this assumption, the probability measure of p is obtained by performing the change of 



variable A = r^ 1 • p in eq. (5.2). Unfortunately, we have no analytic representation of r" 1 . Thus, 
we have neither an expression for the distribution of p. However, p can be extracted numerically 
as follows: 

i) generate randomly £ ~ W v (p 1 p^ 1 -I v ) by means of the Bartlett decomposition [TO] : 

ii) take the ordered eigenvalue spectrum A of E; 

Hi) obtain p by applying the truncation operator r p to A. 

Note that since W v (p, p^ 1 ■ !«) is only defined for p > v, we need to rescale p as v increases. The 
simplest choice is to keep the ratio p/v fixed. The larger this ratio, the closer £ fluctuates around 
ILj (recall that if £ ~ W v (p, p~ l ■ i v ), then E[Sy] = 5ij and var(Ey) = [1 + 8ij]). In view of this, 
large values of p/v are to be avoided, since they reduce the probability of testing the fixed point 
iteration on eigenvalue spectra characterized by large condition numbers n cont j = X v /Xx. For this 
reason, we have set p = 2v in our numerical study. 

Having specified an ensemble of matrices from which to extract the eigenvalue spectra, we are 
now ready to perform numerical simulations. To begin with, we report in Fig. [2] the marginal 
probability density function of the ordered eigenvalues {Xk}% =1 and their truncated counterparts 
{Mfc}fc=i f° r the Wishart ensemble Wio(20, 2CU 1 • Iio) at p = 1, as obtained numerically from a 
rather large sample of matrices (~ 10 6 units). It will be noted that i) the effect of the truncation 
is severe on the largest eigenvalues, as a consequence of the analytic bounds of Corollary 2.1 and 
Proposition 2.2; ii) while the skewness of the lowest truncated eigenvalues is negative, it becomes 



positive for the largest ones. This is due to a change of relative effectiveness of eq. (2.17) i) with 



respect to eq. (2.17) ii). 



5.2 Choice of the simulation parameters 

In order to explore the dependence of n- lt upon p, we need to choose one or more simulation points 
for the latter. Ideally, it is possible to identify three different regimes in our problem: p < Ai 



14 





v = 3 


v = 4 


v = 5 


v = 6 


v = 7 


v = 8 


v = 9 


v = 10 


Mo(Ai) 


0.1568 


0.1487 


0.1435 


0.1383 


0.1344 


0.1310 


0.1269 


0.1258 


Mo(A 2 ) 


0.6724 


0.4921 


0.4017 


0.3424 


0.3039 


0.2745 


0.2554 


0.2399 


Mo(A 3 ) 


1.6671 


1.0112 


0.7528 


0.6071 


0.5138 


0.4543 


0.4048 


0.3693 


Mo(A 4 ) 


- 


1.8507 


1.2401 


0.9621 


0.7854 


0.6684 


0.5858 


0.5288 


Mo(A 5 ) 


- 


- 


2.0150 


1.4434 


1.1269 


0.9263 


0.7956 


0.7032 


Mo(A 6 ) 


- 


- 


- 


2.1356 


1.5789 


1.2559 


1.0527 


0.9111 


Mo(A 7 ) 


- 


- 


- 


- 


2.2190 


1.6764 


1.3673 


1.1603 


Mo(A 8 ) 












2.2763 


1.7687 


1.4624 


Mo(A 9 ) 














2.3210 


1.8473 


Mo(Aio) 
















2.3775 



Table 1 - Numerical estimates of the mode of the ordered eigenvalues {Ai, . . . , X v } of S ~ 
W v (2v, (2w) _1 • I„) with v = 3, . . . , 10. The estimates have been obtained from Grenander's 
mode estimator [TT] . 



(strong truncation regime), Ai < p < A^ (crossover) and p > X v (weak truncation regime). We 
cover all of them with the following set of points: 

p G {Mo(Ai), ... ,Mo(A t) )} U j^Mo(Ai), 2Mo(A„)| , (5.3) 

where Mo(-) stands for the mode. In principle, it is possible to determine Mo(Afc) with high 
accuracy by using analytic representations of the marginal probability densities of the ordered 
eigenvalues [12]. In practice, the latter become computationally demanding at increasingly large 
values of v: for instance, the determination of the probability density of A2 requires (vl) 2 sums, which 
is unfeasible even at v ~ 10. Moreover, to our aims it is sufficient to choose approximate values, 
provided these lie not far from the exact ones. Accordingly, we have determined the eigenvalue 
modes numerically from samples of N ~ 10 6 Wishart matrices. Our estimates are reported in 
Table [T] for v = 3, . . . , 10. They have been obtained from Grenander's estimator [llj . 



, 1^=1 \ A k \ \ A k ~ A k , 

Mo(A fe ) rs = V / \ s , (5.4) 

— ^1 — 1 \ k k 

with properly chosen parameters r, s. 

We are now in the position to investigate numerically how many terms in Ruben's expansions 
must be considered as e is set to ea p = 1.0 x 10~ 14 , for our choice of the eigenvalue ensemble 
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A ~ W v (2v, {2v)~ l ■ I v ) and with p set as in Table [Tj As an example, we report in Fig. [3] the 
discrete distributions of fcth for the basic Gaussian integral a at v = 10, the largest dimension 
we have simulated. As expected, we observe an increase of kth with p. Nevertheless, we see that 
the number of Ruben's components to be taken into account for a double precision result keeps 
altogether modest even in the weak truncation regime, which proves the practical usefulness of the 
chi-square expansions. 



5.3 Fixed point iteration at work 

The GJ iteration is too slow to be of practical interest. For instance, at v = 10, p ~ Mo(Ai) and 
e T = 1.0 x 10~ 7 (corresponding to a reconstruction of A with single floating-point precision) it is 
rather easy to extract realizations of p which require ri[ t ~ 15.000 to converge. An improvement of 
the GJ scheme is achieved via over-relaxation (GJOR), i.e. 



x(o) _ 

A GJOR,fc ~~ ^ k ' 
GJOR,/c — A GJOR,fc W 



rp I \ ( n ) ^ \W 
Jfc^GJORj A GJOR,fc 



k = 1, 



(5.5) 



Evidently, at oj = 1 the GJOR scheme coincides with the standard GJ one. The optimal value 
cj op t of the relaxation factor uj is not obvious even in the linear Jacobi scheme, where uj op t depends 
upon the properties of the coefficient matrix of the system. For instance, if the latter is symmetric 
positive definite, it is demonstrated that the best choice is provided by w op t = 2(l + v / T 
being a the spectral radius of the Jacobi iteration matrix [13]. In our numerical tests with the 
GJOR scheme, we found empirically that the optimal value of uj at p < A,, is close to the linear 
prediction, provided a is replaced by ||J||ooi being J defined as in sect. 3 (note that ||J||oo < !)• 
By contrast, the iteration diverges after few steps with increasing probability as p/X v — > 00 if uj is 
kept fixed at uj = cj opt ; in order to restore the convergence, uj must be lowered towards uj = 1 as 
such limit is taken. 

To give an idea of the convergence rate of the GJOR scheme, we show in Fig. [4] (left) a joint 
box-plot of the distributions of n;t at v = 10 and e T = 1.0 x 10~ 7 . From the plot we observe that 
the distribution of nit shifts rightwards as p decreases: clearly, the reconstruction is faster if p is in 
the weak truncation regime (where p is closer to A), whereas it takes more iterations in the strong 
truncation regime. The dependence of n; t upon p, systematically displayed in Fig. [5j is compatible 
with a scaling law 

logn it (p,v,e T ) = a(v,e T ) - b(v,e T )logp, (5.6) 



apart from small corrections occurring at large p. Eq. (5.6) tells us that n; t increases polynomially 



in 1/p at fixed v. In order to estimate the parameters o and b in the strong truncation regime (where 



the algorithm becomes challenging), we performed jackknife fits to eq. (5.6) of data points with 
p < 1. Results are collected in Fig. [4] (right), showing that b is roughly constant, while a increases 
almost linearly in v. Thus, while the cost of the eigenvalue reconstruction is only polynomial in 
1/p at fixed v, it is exponential in v at fixed p. The scaling law of the GJOR scheme is therefore 
better represented by n- lt = Ce KV /p b , with C being a normalization constant independent of p and 
v, and k representing approximately the slope of a as a function of v. Although the GJOR scheme 
improves the GJ one, the iteration reveals to be still inefficient in a parameter subspace, which is 
critical for the applications. 
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Fig. 2 - Monte Carlo simulation of the probability density function of the ordered eigenvalues 
Afe (even rows) and their truncated counterparts jjik at p = 1 (odd rows) for the Wishart 
ensemble Wio(20, 20 _1 • Iio). The last two plots (bottom right) display the distribution of the 
sum of eigenvalues. 
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4. I L 
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p = 1.8473 
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1 1 1 1 1 — 

p =4.7550 
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Fig. 3 - Monte Carlo simulation of the probability mass function of the parameter fc t h for 
the Gaussian probability content a. The histograms refer to the eigenvalue ensemble A of 
S ~ Wi (20, 2CT 1 ■ I10), with p chosen as in Tableland e = 1.0 x 10" 
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I — [J] — I P =4.7550 

I — [|}H P =2.3775 

i — HH p= 1 - 8473 

I — Q]—l p =1.4624 
h^H P = 1.1603 
l-HH P=0.9111 
p =0.7032 1— [J]-| 
p = 0.5288 |-[]}H 
p =0.3693 \-^-\ 

p =0.2399 h-QH 

p = 0.1258 |-[J]H 

p =0.06288 | — [J— | 



V 


a {jk.err.) 


b {jk.err.) 


3 


4.93(3) 


0.940(1) 


4 


5.25(2) 


0.940(1) 


5 


5.44(2) 


0.948(1) 


6 


5.65(1) 


0.953(1) 


7 


5.85(1) 


0.948(1) 


8 


6.02(1) 


0.951(1) 


9 


6.18(1) 


0.946(1) 


10 


6.31(1) 


0.948(1) 



10' 



riu 



Fig. 4 - (left) Box-plot of na in the GJOR scheme at v = 10, with e T = 1.0 x 10 -7 and p 
chosen as in Table [l] The distributions have been reconstructed from a sample of N ~ 10 3 
eigenvalue spectra extracted from Wio(20, 20" 1 -Iio). The whiskers extend to the most extreme 
data point within (3/2)(75% — 25%) data range, (right) Numerical estimates of the scaling 
parameters a and b of the GJOR scheme, as obtained from jackknifc fits to eq. (5.6) of data 
points with p < 1 and e T = 1.0 x 10 -7 . We quote in parentheses the jackknife error. 



5.4 Boosting the GJOR scheme 

A further improvement can be obtained by letting ui depend on the eigenvalue index in the GJOR 
scheme. Let us discuss how to work out such an adjustment. On commenting Fig. [2j we have 
already noticed that the largest eigenvalues are affected by the truncation to a larger extent than 
the smallest ones. Therefore, they must perform a longer run through the fixed point iteration, 
in order to converge to the untruncated values. This is a possible qualitative explanation for the 
slowing down of the algorithm as p — > 0. In view of it, we expect to observe some acceleration of 
the convergence rate, if u) is replaced, for instance, by 



(1 + /3- k)u) opt 



P> o. 



k = 1, 



(5.7) 



The choice (3 = corresponds obviously to the standard GJOR scheme. Any other choice yields 
uik > u) opt . Therefore, the new scheme is also expected to display a higher rate of failures than the 
GJOR one at p ^> X v , for the reason explained in sect. 5.3. The component -wise over-relaxation 



proposed in eq. (5.7) is only meant to enhance the convergence speed in the strong truncation 



regime and in the crossover, where the improvement is actually needed. 
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Fig. 5 - Log-log plots of nit vs. p in the GJOR scheme at e T = 1.0 x 10~ 7 . The parameter p 
has been chosen as in Table [l] The (red) dashed line in each plot represents our best jackknife 
linear fit to eq. (5.6) of data points with p < 1. 
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Fig. 6 - Scaling parameters a and b of the modified GJOR scheme as functions of v at 
e T = 1.0 x 10~ 7 , with j3 varying in the range 0-^2 in steps of 1/5. Each trajectory (represented 
by a dashed curve) refers to a different value of f3. Those with darker markers correspond to 
smaller values of /3 and the other way round. 



In order to confirm this picture, we have explored systematically the effect of (3 on n; t by 
simulating the reconstruction process at v = 3, . . . , 10, with f3 varying from to 2 in steps of 1/5. 
First of all, we have observed that the rate of failures at large p is fairly reduced if the first 30 -f- 50 
iterations are run with ujk = w pt, and only afterwards (3 is switched on. Having minimized the 
failures, we have checked that for each value of f3, the scaling law assumed in eq. (5.6) is effectively 
fulfilled. Then, we have computed jackknife estimates of the scaling parameters a and b. These are 
plotted in Fig. [6] as functions of v. Each trajectory (represented by a dashed curve) corresponds 
to a given value of (3. Those with darker markers refer to smaller values of (3 and the other way 
round. From the plots we notice that 



i) all the trajectories with /3 > lie below the one with f3 = 0; 

ii) the trajectories of a display a clear increasing trend with v, yet their slope lessens as (3 
increases. By contrast, the trajectories of b develop a mild increasing trend with v as (3 
increases, though this is not strictly monotonic; 

Hi) the trajectories of both a and b seem to converge to a limit trajectory as (3 increases; we 
observe a saturation phenomenon, which thwarts the benefit of increasing (3 beyond a certain 
threshold close to /3 max — 2. 



We add that pushing (3 beyond /3 max is counterproductive, as the rate of failures becomes increasingly 
relevant in the crossover and eventually also in the strong truncation regime. By contrast, if 
(3 < /3 max the rate of failures keeps very low for essentially all simulated values of p. 

Our numerical results signal a strong reduction of the slowing down of the convergence rate. 
Indeed, i) means qualitatively that C and b are reduced as f3 increases, ii) means that n is reduced as 
f3 increases (this is the most important effect, as k is mainly responsible for the exponential slowing 
down with v). The appearance of a slope in the trajectories of b as f3 increases indicates that a 
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Fig. 7 - The parameter k as a function of /3. Estimates of k are obtained from least-squares 
fits of data to a linear model a = ao + k ■ v. 

mild exponential slowing down is also developed at denominator of the scaling law n; t = Ce KV /p b , 
but the value of b is anyway smaller than at (3 = 0. Finally, Hi) means that choosing (3 > /? max has 
a minor impact on the performance of the algorithm. In Fig. [7j we report a plot of the parameter 
n (obtained from least-squares fits of data to a linear model a = a® + k ■ v) as a function of {3. 
We see that k(/3 = 0)/k(/3 = 2) ~ 4. This quantifies the maximum exponential speedup of the 
convergence rate, which can be achieved by our proposal. When /3 is close to /3 max , n; t amounts to 
few hundreds at v = 10 and p ~ Ai/2. 



6 Conclusions 

In this paper we have studied how to reconstruct the covariance matrix £ of a normal multivariate 
X ~ A/^(0,S) from the matrix (5g of the spherically truncated second moments, describing the 
covariances among the components of X when the probability density is cut off outside a centered 
Euclidean ball. We have shown that £ and Gq share the same eigenvectors. Therefore, the problem 
amounts to relating the eigenvalues of £ to those of Such relation entails the inversion of 
a system of non-linear integral equations, which admits unfortunately no closed-form solution. 
Having found a necessary condition for the invertibility of the system, we have shown that the 
eigenvalue reconstruction can be achieved numerically via a converging fixed point iteration. In 
order to prove the convergence, we rely ultimately upon some probability inequalities, known in 
the literature as square correlation inequalities, which are only conjectured in this work (they will 
be systematically investigated for our specific set-up in a forthcoming paper [3]). 

In order to explore the convergence rate of the fixed point iteration, we have implemented some 
variations of the non-linear Gauss-Jacobi scheme. Specifically, we have found that over-relaxing 
the basic iteration enhances the convergence rate by a moderate factor. However, the over-relaxed 
algorithm still slows down exponentially in the number of eigenvalues and polynomially in the 
truncation radius of the Euclidean ball. In the last part of the paper, we have shown that a 
significant reduction of the slowing down can be achieved in the regime of strong truncation by 
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adapting the relaxation parameter to the eigenvalue it is naturally associated with, so as to boost 
the higher components of the spectrum. 

A concrete implementation of the proposed approach requires the computation of a set of 
multivariate Gaussian integrals over the Euclidean ball. For this, we have extended to the case 
of interest a technique, originally proposed by Ruben for representing the probability content of 
quadratic forms of normal variables as a series of chi-square distributions. In the paper, we have 
shown the practical feasibility of the series expansion for the integrals involved in our computations. 
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Appendix A Proof of Theorem 4.1 



As already mentioned in sect. 4, the proof follows in the tracks of the original one of ref. [2]. We 
detail the relevant steps for aj., while for ajk we only explain why it is necessary to distinguish 
between equal or different indices and the consequences for either case. 



In order to prove eq. (4.7), we first express a k in spherical coordinates, i.e. we perform the 
change of variable x = ru, being r = ||x|| and u G dB v {l) (recall that d^x = r v ~ 1 dr du, with du 
embodying the angular part of the spherical Jacobian and the differentials of u — 1 angles); then 
we insert a factor of 1 = exp(r 2 /2s) exp(— r 2 /2s) under the integral sign. Hence, atk reads 

1 f-^P r 2 ( r 2 \ f ( Q(u)r 2 \ 

°><m = (27r) */ 2 |A|i/ 2 L V xp \-2 S ) L vW du ^ {-—) • (A - 1} 

The next step consists in expanding the inner exponential in Taylor serie^J 



viz. 



7 m=0 



This series converges uniformly in u. We review the estimate just for the sake of completeness: 



ml 2 r ' 

m=0 



°° 1 r 2m /r 2 nn\ 



m=0 

being qo = maxj |l/s — 1/Aj|. It follows that we can integrate the series term by term. With the 
help of the uniform average operator introduced in eq. (4.3), is recast to 



a k (p;\) = Y — :^TTTT7^ M K-Q) m 4}^ t 1 , , ; / dr r v+2m+1 exp ( -— ] . (A.4) 



2 in his original proof, Ruben considers a more general set-up, with the center of the Euclidean ball shifted by a 
vector b £ R" from the center of the distribution. In that case, the Gaussian exponential looks different and must 
be expanded in series of Hermite polynomials. Here, we work in a simplified set-up, where the Hermite expansion 
reduces to Taylor's. 
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The presence of an additional factor of it? in the angular average is harmless, since |u?| < 1. 
We finally notice that the radial integral can be expressed in terms of a cumulative chi-square 
distribution function on replacing r —> \/rs, namely 



dr r 



v+2m+l 



exp 



2s 



2 v/2 +ms v/2 +m+ l T V +m + 1 ) F 



v+2(m+l) 



(A.5) 



Inserting eq. (A.5) into eq. (A. 4) results in Ruben's representation of a.]-. This completes the first 
part of the proof. 



As a next step, we wish to demonstrate that the function ip^ of eq. (4.11) is the generating 
function of the coefficients c^- m . To this aim, we first recall the identities 



-1/2 



-3/2 



(27T)- 1 / 2 
(27T)- 1 / 2 



— oo 
oo 



dx exp ( — -x* 



dx x exp [ ~2 X ~ 



(A.6) 
(A.7) 



valid for a > 0. On setting m = [1 — (1 — s/\i)z], we see that ip^ can be represented in the integral 
form 



\ 3 / 2 
S \ ,- v /2 



1/2 



,t»/2 



dxfc x\ exp 
dxj exp 



T 2 



1-1 



A fc (27r)-/ 2 |A|V2 



d^xx^, exp I — -zsQ(x 



provided 2 < min^ |1 — s/Aj| _1 . As previously done, we introduce spherical coordinates x = ru, and 
expand exp{ — ^zsQ(x)} = exp{— ^zsr 2 Q(u)} in Taylor series. By the same argument as above, 
the series converges uniformly in u (the factor of zs does not depend on u) , thus allowing term by 
term integration. Accordingly, we have 



.v/2 



A fc (2tt)-/ 2 |A|V 



1/2 



m=0 



2 m m\ J 



drr v+2(m+l)~l e -r 2 /2 



du[-Q(u)ru 2 k . (A.9) 



dB v (l) 



We see that the r.h.s. of eq. (A.9) looks similar to eq. (A. 4), the only relevant differences being the 
presence of the factor of z m under the sum sign and the upper limit of the radial integral. With 
some algebra, we arrive at 



h \ mlx * i A i 1/2 w 2 ) 



(A.10) 



The series coefficients are recognized to be precisely those of eq. (4.9). 
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In the last part of the proof, we derive the recursion fulfilled by the coefficients c k - m . To this 



aim, the m th derivative of ip k has to be evaluated at z = and then identified with ml c k - r 
key observation is that differentiating if) k reproduces ip k itself, that is to say 



The 



ij' k (z) = v k (z)i/> k (z) 



with 



(A.11) 



(A.12) 



and the auxiliary coefficient e k -i being defined as in eq. (4.18). Eq. (A. 11) lies at the origin of 
the recursion. Indeed, from eq. (A.ll) it follows that that t/j'l is a function of ip' k and ip k , viz. 
= ^'k^k + ^feV'fc- Proceeding analogously yields the general formula 



/ ( m ) 



m-l , x 
r=0 V 7 



(m- 



(A.13) 



At z = 0, this reads 



to! c fc;i 



m— 1 

E 

r=0 



(to 



(to 



-l)!r! fc 



(0) r! c fc;; 



The last step consists in proving that 



*i m) (o) 



1 



m\g k]m +i 



(A.14) 



(A.15) 



with g k - m defined as in eq. (4.16). This can be done precisely as explained in ref. [2]. 



Having reiterated Ruben's proof explicitly in a specific case, it is now easy to see how the 



theorem is extended to any other Gaussian integral. First of all, from eq. (A.l) we infer that each 
additional subscript in a k e m ._, enhances the power of the radial coordinate under the integral sign 
by 2 units. This entails a shift in the number of degrees of freedom of the chi-square distributions 
in Ruben's expansion, amounting to twice the number of subscripts. For instance, since aj k has 
two subscripts, its Ruben's expansion starts by F v+ 4, independently of whether j = k or j ^ k. 
In second place, we observe that in order to correctly identify the generating functions of Ruben's 
coefficients for a higher-order integral a/c£ m ..., we need to take into account the multiplicities of the 
indices k, £, m,. ... As an example, consider the case of ifjj k (j ^ k) and tp k k- By going once more 
through the argument presented in eq. (A.8), we see that eqs. ( A.6| )— ( A.7) are sufficient to show 
that eq. (4.13) is the generating function of aj k . By contrast, in order to repeat the proof for the 
case of ipkki we need an additional integral identity, namely 



-5/2 



1 



(27T)" 



-1/2 



dx x exp ( — —x' 



(A.16) 



valid once more for a > 0. Hence, we infer that ip k k must depend upon \ k via a factor of [1 — (1 — 
s/ Afc)z]~ 5 / 2 , whereas V^fc (j 7^ k) must depend on Aj and via factors of resp. [1 — (1 — s/Aj)z] -3 / 2 
and [1 — (1 — s/Afc)z] -3 / 2 . The different exponents are ultimately responsible for the specific values 
taken by the auxiliary coefficients e k k\i and ej k -i of eq. (4.19). 
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To conclude, we observe that the estimates of the residuals 1Zk ]m and 1Zjk ]m , presented in sect. 4 
without an explicit proof, do not require any further technical insight than already provided by 
ref. [2] plus our modest considerations. We leave them to the reader, since they can be obtained 
once more in the tracks of the original derivation of lZ m . 
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